Predefined pattern detection in large time series

Authors

  • Shengfa Miao
  • Ugo Vespier
  • Ricardo Cachucho
  • Marvin Meeng
  • Arno J. Knobbe
Abstract

Detecting predefined patterns in time series is an interesting and challenging task. To reduce its computational cost and increase its effectiveness, a number of time series representation methods and similarity measures have been proposed. Most existing methods focus on full sequence matching, that is, on sequences with clearly defined beginnings and endings, where all data points contribute to the match. These methods, however, do not account for temporal and magnitude deformations in the data, and they prove ineffective in many real-world scenarios where noise and external phenomena introduce diversity in the class of patterns to be matched. In this paper, we present a novel pattern detection method based on the notions of templates, landmarks, constraints and trust regions. We employ the Minimum Description Length (MDL) principle in the time series preprocessing step, which helps to preserve all the prominent features and prevents the template from overfitting. Templates are provided by ordinary users or domain experts, and represent the interesting patterns we want to detect in the time series. Instead of matching templates against all the potential subsequences of the time series, we translate the time series and the templates into landmark sequences, and detect patterns in the landmark sequence of the time series. By defining constraints within the template landmark sequence, we effectively extract all the landmark subsequences from the time series landmark sequence, and obtain a number of landmark segments (time series subsequences or instances). We model each landmark segment by scaling the template in both the temporal and the magnitude dimension. To suppress the influence of noise, we introduce the concept of a trust region, which not only helps to achieve an improved instance model, but also helps to locate the boundaries of instances of the given template accurately.
Based on the similarities derived from the instance models, we use a probability density function to calculate a similarity threshold. The threshold can be used to judge whether or not a landmark segment is a true instance of the given template. To evaluate the effectiveness and efficiency of the proposed method, we apply it to two real-world datasets. The results show that our method is capable of detecting patterns with temporal and magnitude deformations with competitive performance.

S. Miao et al. / Information Sciences 329 (2016) 950–964. http://dx.doi.org/10.1016/j.ins.2015.04.018. ISSN 0020-0255. © 2015 Elsevier B.V. All rights reserved. Corresponding author at: LIACS, Leiden University, The Netherlands.
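As a rough illustration of two ideas named in the abstract, the sketch below extracts landmarks from a series and fits a template to a segment by scaling it in time and magnitude. It is not the authors' algorithm: the abstract does not define landmarks, constraints, trust regions or the similarity measure, so the choices here are assumptions. Landmarks are taken to be strict local extrema plus the two endpoints, temporal scaling is linear interpolation, magnitude scaling is an ordinary least-squares fit, and the coefficient of determination stands in for the similarity. The function names `landmarks`, `resample` and `fit_template` are hypothetical.

```python
from typing import List, Tuple


def landmarks(series: List[float]) -> List[int]:
    """Indices of strict local extrema, plus both endpoints (an assumption)."""
    idx = [0]
    for i in range(1, len(series) - 1):
        left, mid, right = series[i - 1], series[i], series[i + 1]
        if (mid > left and mid > right) or (mid < left and mid < right):
            idx.append(i)
    idx.append(len(series) - 1)
    return idx


def resample(template: List[float], n: int) -> List[float]:
    """Temporal scaling: linearly interpolate the template to n points.

    Assumes the template has at least two points.
    """
    if n == 1:
        return [template[0]]
    step = (len(template) - 1) / (n - 1)
    out = []
    for k in range(n):
        pos = k * step
        lo = min(int(pos), len(template) - 2)  # left neighbour index
        frac = pos - lo
        out.append(template[lo] * (1 - frac) + template[lo + 1] * frac)
    return out


def fit_template(segment: List[float], template: List[float]) -> Tuple[float, float, float]:
    """Magnitude scaling: least-squares a, b such that segment ~ a * template + b.

    Returns (a, b, r2), where r2 is the coefficient of determination,
    used here as a stand-in similarity score.
    """
    t = resample(template, len(segment))
    n = len(segment)
    mean_t = sum(t) / n
    mean_s = sum(segment) / n
    var_t = sum((x - mean_t) ** 2 for x in t)
    cov = sum((x - mean_t) * (y - mean_s) for x, y in zip(t, segment))
    a = cov / var_t if var_t else 0.0
    b = mean_s - a * mean_t
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(t, segment))
    ss_tot = sum((y - mean_s) ** 2 for y in segment)
    r2 = 1.0 - ss_res / ss_tot if ss_tot else 0.0
    return a, b, r2


if __name__ == "__main__":
    series = [0, 1, 2, 1, 0, 3, 6, 3, 0]
    marks = landmarks(series)
    print("landmark indices:", marks)
    # Fit a triangular template to the segment between two landmarks.
    segment = series[marks[2]:marks[4] + 1]
    print("fit (a, b, r2):", fit_template(segment, [0, 1, 0]))
```

In this sketch, a segment would be accepted as an instance when its similarity exceeds a threshold. The abstract derives that threshold from a probability density function over the similarities, a step that is not reproduced here.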

Similar resources

Identification of outliers types in multivariate time series using genetic algorithm

Multivariate time series data are often modeled using the vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...

Traffic Condition Detection in Freeway by using Autocorrelation of Density and Flow

Traffic conditions vary over time, and therefore, traffic behavior should be modeled as a stochastic process. In this study, a probabilistic approach utilizing Autocorrelation is proposed to model the stochastic variation of traffic conditions, and subsequently, predict the traffic conditions. Using autocorrelation of the time series samples of density and flow which are collected from segments...

A Novel Method for Detection of Epilepsy in Short and Noisy EEG Signals Using Ordinal Pattern Analysis

Introduction: In this paper, a novel complexity measure is proposed to detect dynamical changes in nonlinear systems using ordinal pattern analysis of time series data taken from the system. Epilepsy is considered a dynamical change in the nonlinear and complex brain system. The ability of the proposed measure to characterize normal and epileptic EEG signals when the signal is short or is...

Anomaly Detection in Time Series of Chlorophyll Around the Time and Location of Large Coastal Earthquakes Using Random Forest Method

Earthquakes are among the most devastating natural hazards, and efforts to predict their time, location and magnitude have not yet been completely successful. Remote sensing data have proved to be an effective source of information about the lithospheric and atmospheric activities around impending earthquakes, which are referred to as earthquake precursors. The issue of detecting anomalies in ...

Characterization of System Status Signals for Multivariate Time Series Discretization Based on Frequency and Amplitude Variation

Many fault detection methods have been proposed for monitoring the health of various industrial systems. Characterizing the monitored signals is a prerequisite for selecting an appropriate detection method. However, fault detection methods tend to be decided with user's subjective knowledge or their familiarity with the method, rather than following a predefined selection rule. This study inves...

On the Detection of Trends in Time Series of Functional Data

A sequence of functions (curves) collected over time is called a functional time series. Functional time series analysis is a popular research area in which statistics from such data are frequently observed. The main purpose of functional time series analysis is to predict and describe the random mechanisms that generated the data. To do so, it is necessary to decompose functional ti...


Journal title:
  • Inf. Sci.

Volume 329

Pages 950–964

Publication date 2016